 navair public release


Continuous Mean-Zero Disagreement-Regularized Imitation Learning (CMZ-DRIL)

arXiv.org Artificial Intelligence

Machine-learning paradigms such as imitation learning and reinforcement learning can generate highly performant agents in a variety of complex environments. However, commonly used methods require large quantities of data and/or a known reward function. This paper presents a method called Continuous Mean-Zero Disagreement-Regularized Imitation Learning (CMZ-DRIL) that employs a novel reward structure to improve the performance of imitation-learning agents that have access to only a handful of expert demonstrations. CMZ-DRIL uses reinforcement learning to minimize uncertainty among an ensemble of agents trained to model the expert demonstrations. The method does not use any environment-specific rewards; instead, it creates a continuous, mean-zero reward function from the action disagreement of the agent ensemble. As demonstrated in a waypoint-navigation environment and in two MuJoCo environments, CMZ-DRIL can generate performant agents that behave more like the expert than the main prior approaches on several key metrics.
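As a rough illustration of the reward idea described in this abstract, the sketch below turns ensemble action disagreement into a continuous reward that averages to zero. The abstract does not give the exact formula, so everything here is an assumption: disagreement is measured as the action variance across ensemble members, and the reward is centered by subtracting each step's disagreement from the batch mean. The function names `ensemble_disagreement` and `mean_zero_rewards` are hypothetical.

```python
from statistics import mean, pvariance


def ensemble_disagreement(actions):
    """Variance of the ensemble's proposed actions, summed over action dimensions.

    `actions` holds one list of floats per ensemble member, all the same length.
    Assumption: variance is the disagreement measure; the paper may use another.
    """
    per_dimension = zip(*actions)  # regroup values by action dimension
    return sum(pvariance(dim) for dim in per_dimension)


def mean_zero_rewards(disagreements):
    """Map per-step disagreements to continuous rewards that average to zero.

    Assumption: centering on the batch mean, so below-average disagreement
    earns a positive reward and above-average disagreement a negative one.
    """
    baseline = mean(disagreements)
    return [baseline - d for d in disagreements]


# Three hypothetical ensemble members proposing 2-D actions in two states.
state_a = [[0.10, 0.50], [0.12, 0.48], [0.11, 0.52]]    # members roughly agree
state_b = [[0.90, -0.40], [0.10, 0.60], [-0.50, 0.20]]  # members disagree
rewards = mean_zero_rewards(
    [ensemble_disagreement(state_a), ensemble_disagreement(state_b)]
)
```

Under these assumptions, a reinforcement-learning agent maximizing this reward would be pushed toward states where the ensemble agrees, which are presumably the states covered by the expert demonstrations.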


Assurance for Deployed Continual Learning Systems

arXiv.org Artificial Intelligence

The future success of the Navy will depend, in part, on artificial intelligence. In practice, many artificially intelligent algorithms, and in particular deep learning models, rely on continual learning to maintain performance in dynamic environments: the software must adapt to sustain its initial level of performance in unseen situations. However, if not monitored properly, continual learning may lead to several issues, including catastrophic forgetting, in which a trained model forgets previously learned tasks when it is retrained on new data. The authors created a new framework for safely performing continual learning, with the goal of pairing it with a deep learning computer vision algorithm to allow safe and high-performing automatic deck tracking on carriers and amphibious assault ships. The safety framework includes several components: an ensemble of convolutional neural networks to perform image classification, a manager to record confidences and determine the best answer from the ensemble, a model of the environment to predict when the system may fail to meet minimum performance metrics, a performance monitor to log system and domain performance and check them against requirements, and a retraining component to update the ensemble and manager to maintain performance. The authors validated the proposed method using extensive simulation studies based on dynamic image classification. The results show that the safety framework can probabilistically detect out-of-distribution data, can detect when the system is no longer performing safely, and can significantly extend the working envelope of an image classifier.
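The manager component described in this abstract can be pictured with a small sketch. The abstract does not specify how the manager combines the ensemble's outputs, so this example assumes simple averaging of per-class probabilities and a fixed confidence threshold for flagging inputs that may need review. The function `manage_ensemble` and the parameter `min_confidence` are hypothetical names, not the authors' API.

```python
def manage_ensemble(member_probs, min_confidence=0.6):
    """Combine per-member class probabilities into one answer.

    `member_probs` holds one probability list per ensemble member.
    Returns (best_class, confidence, flagged), where `flagged` is True when
    the averaged confidence falls below `min_confidence`.
    Assumption: plain averaging and a fixed threshold stand in for whatever
    combination rule the framework actually uses.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    averaged = [
        sum(probs[c] for probs in member_probs) / n_members
        for c in range(n_classes)
    ]
    best_class = max(range(n_classes), key=lambda c: averaged[c])
    confidence = averaged[best_class]
    return best_class, confidence, confidence < min_confidence


# Hypothetical outputs from three classifiers over three classes.
familiar = [[0.90, 0.05, 0.05], [0.80, 0.10, 0.10], [0.85, 0.10, 0.05]]
unfamiliar = [[0.40, 0.30, 0.30], [0.30, 0.40, 0.30], [0.30, 0.30, 0.40]]

familiar_result = manage_ensemble(familiar)
unfamiliar_result = manage_ensemble(unfamiliar)
```

In this toy case the members agree on the first input and split on the second, so only the second is flagged. A performance monitor could log such flags over time and trigger retraining when they become frequent.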


Adaptive Neural Networks Using Residual Fitting

arXiv.org Artificial Intelligence

Current methods for estimating the required neural-network size for a given problem class have focused on techniques that can be computationally intensive, such as neural-architecture search and pruning. In contrast, methods that add capacity to neural networks as needed may provide results similar to those of architecture search and pruning while requiring less computation to find an appropriate network size. Here, we present a network-growth method that searches for explainable error in the network's residuals and grows the network if sufficient error is detected. We demonstrate this method using examples from classification, imitation learning, and reinforcement learning. Within these tasks, the growing network can often achieve better performance than small networks that do not grow, and similar performance to networks that begin much larger.
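The growth test described in this abstract (look for explainable error in the residuals, grow if enough is found) can be sketched in a few lines. This is not the authors' procedure: here a one-variable least-squares fit stands in for whatever model they fit to the residuals, the fraction of residual variance it explains is used as the "explainable error", and the threshold of 0.5 is an arbitrary assumption. The names `explained_fraction` and `should_grow` are hypothetical.

```python
from statistics import mean


def explained_fraction(xs, residuals):
    """R^2 of a one-variable least-squares fit of the residuals on the inputs."""
    mean_x, mean_r = mean(xs), mean(residuals)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    srr = sum((r - mean_r) ** 2 for r in residuals)
    if sxx == 0 or srr == 0:
        return 0.0  # no variation to explain
    sxr = sum((x - mean_x) * (r - mean_r) for x, r in zip(xs, residuals))
    return (sxr ** 2) / (sxx * srr)


def should_grow(xs, ys, predict, threshold=0.5):
    """Return True when the current model's residuals contain structure.

    Assumption: a linear probe and a fixed threshold decide when to grow.
    """
    residuals = [y - predict(x) for x, y in zip(xs, ys)]
    return explained_fraction(xs, residuals) > threshold


# Toy data generated by y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

too_small = should_grow(xs, ys, predict=lambda x: 5.0)          # constant model
adequate = should_grow(xs, ys, predict=lambda x: 2.0 * x + 1.0)  # exact model
```

The constant model leaves residuals that depend on the input, so the check recommends growth; the exact model leaves nothing to explain, so it does not.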


A Software Tool for Evaluating Unmanned Autonomous Systems

arXiv.org Artificial Intelligence

North Carolina Agricultural and Technical State University (NC A&T), in collaboration with Georgia Tech Research Institute (GTRI), has developed methodologies for creating simulation-based technology tools that are capable of inferring the perceptions and behavioral states of autonomous systems. These methodologies have the potential to provide the Test and Evaluation (T&E) community at the Department of Defense (DoD) with greater insight into the internal processes of these systems. The methodologies use only external observations and require neither complete knowledge of the internal processing of the system under test nor any modifications to it. This paper presents an example of one such simulation-based technology tool, named the Data-Driven Intelligent Prediction Tool (DIPT). DIPT was developed for testing a multi-platform Unmanned Aerial Vehicle (UAV) system capable of conducting collaborative search missions. DIPT's Graphical User Interface (GUI) enables testers to view an aircraft's current operating state, predicts its current target-detection status, and provides the reasoning behind a particular behavior along with an explanation of why a particular task was assigned to the aircraft.